Active ensemble learning: Application to data mining and bioinformatics

Authors

  • Hiroshi Mamitsuka
  • Naoki Abe
Abstract

This paper describes a new set of learning procedures proposed by the authors. The approach, which may be called active ensemble learning, combines active learning with the accuracy-enhancement techniques of bagging and boosting. Each procedure achieves highly accurate learning by iteratively selecting (querying) a small amount of highly informative data from a data space or database. The paper covers not only the technical aspects of the method but also the results of applying it to two real problems: active planning of biochemical or molecular biological experiments in immunology, and customer segmentation from large-scale data in the CRM (customer relationship management) field. The proposed methods are shown to achieve greater data efficiency and prediction accuracy than conventional methods. © 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(11): 100–108, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10355
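The abstract does not spell out the procedures themselves. As a rough illustration of the general idea (active learning driven by a bagged committee), the following is a minimal sketch, not the authors' algorithm. It assumes a toy 1-D pool, a threshold "stump" as the component learner, and a labeling oracle; all function names and parameters here are hypothetical.

```python
import random

def train_stump(sample):
    """Fit a 1-D threshold classifier: predict 1 when x >= threshold."""
    xs = sorted({x for x, _ in sample})
    # Candidate thresholds: below all points, midpoints, above all points.
    candidates = ([xs[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
                  + [xs[-1] + 1.0])

    def errors(t):
        return sum((x >= t) != bool(y) for x, y in sample)

    return min(candidates, key=errors)

def bagged_committee(labeled, n_members, rng):
    """Bagging: train one stump per bootstrap resample of the labeled data."""
    committee = []
    for _ in range(n_members):
        boot = [rng.choice(labeled) for _ in labeled]
        committee.append(train_stump(boot))
    return committee

def vote_margin(committee, x):
    """Absolute difference between votes for class 1 and class 0."""
    ones = sum(x >= t for t in committee)
    return abs(2 * ones - len(committee))

def predict(committee, x):
    """Majority vote of the committee (ties go to class 1)."""
    return int(2 * sum(x >= t for t in committee) >= len(committee))

def query_by_bagging(pool, oracle, labeled, n_queries, n_members=11, seed=0):
    """Repeatedly query the pool point on which the committee is most split."""
    rng = random.Random(seed)
    labeled = list(labeled)
    pool = list(pool)
    for _ in range(n_queries):
        committee = bagged_committee(labeled, n_members, rng)
        # Smallest margin = largest disagreement = most informative query.
        x = min(pool, key=lambda c: vote_margin(committee, c))
        pool.remove(x)
        labeled.append((x, oracle(x)))  # ask the oracle for the label
    return bagged_committee(labeled, n_members, rng), labeled

if __name__ == "__main__":
    pool = [i / 100 for i in range(1, 100)]
    oracle = lambda x: int(x >= 0.6)  # hypothetical ground truth
    committee, labeled = query_by_bagging(pool, oracle, [(0.0, 0), (1.0, 1)], 8)
    print(sorted(x for x, _ in labeled))
```

In this sketch the queried points tend to cluster where the committee members disagree, which is the intuition behind selecting "data with large information content"; a boosting-based variant would replace the bootstrap resampling with reweighted training rounds.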

Similar resources

Diversified Ensemble Classifiers for Highly Imbalanced Data Learning and their Application in Bioinformatics

In this dissertation, the problem of learning from highly imbalanced data is studied. Imbalanced data learning is both important and challenging in many real applications. Dealing with a minority class normally needs new concepts, observations and solutions in order to fully understand the underlying complicated models. We try to systematically review and solve this special learning task in t...

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

The prediction of lymphedema via the combination of the selected data mining algorithms

Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Due to the importance of predicting this disease, the use of data mining methods in medical research is more significant than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim of this study was to create a diagnosis system that can ...

A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows

One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has increased. Credit card fraud is a crucial problem in banking and its danger is ever increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

Journal title:
  • Systems and Computers in Japan

Volume 38, Issue 11

Pages 100–108

Publication date: 2007